A Generic Framework for Engineering Graph Canonization Algorithms
The state-of-the-art tools for practical graph canonization are all based on
the individualization-refinement paradigm, and their difference is primarily in
the choice of heuristics they include and in the actual tool implementation. It
is thus not possible to make a direct comparison of how individual algorithmic
ideas affect the performance on different graph classes.
We present an algorithmic software framework that facilitates implementation
of heuristics as independent extensions to a common core algorithm. It
therefore becomes easy to perform a detailed comparison of the performance and
behaviour of different algorithmic ideas. Implementations are provided of a
range of algorithms for tree traversal, target cell selection, and node
invariants, including choices from the literature and new variations. The
framework readily supports extraction and visualization of detailed data from
separate algorithm executions for subsequent analysis and development of new
heuristics.
Using collections of different graph classes we investigate the effect of
varying the selections of heuristics, often revealing exactly which individual
algorithmic choice is responsible for particularly good or bad performance. On
several benchmark collections, including a newly proposed class of difficult
instances, we additionally find that our implementation performs better than
the current state-of-the-art tools.
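To make the refinement half of the individualization-refinement paradigm concrete, the following is a minimal sketch of plain colour refinement in Python. It is not the framework described in the abstract; the function name `refine_colours` and the adjacency-dictionary graph representation are assumptions chosen for illustration.

```python
def refine_colours(adjacency, colours=None):
    """Colour refinement on an undirected graph (illustrative sketch only).

    adjacency: dict mapping each vertex to an iterable of its neighbours.
    colours:   optional dict of initial integer colours; defaults to all 0.
    Returns a dict of stable colours, numbered independently of vertex names.
    """
    if colours is None:
        colours = {v: 0 for v in adjacency}
    while True:
        # A vertex's signature is its own colour plus the sorted
        # multiset of the colours of its neighbours.
        signatures = {
            v: (colours[v], tuple(sorted(colours[u] for u in adjacency[v])))
            for v in adjacency
        }
        # Rename signatures to small integers in sorted order.
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        refined = {v: palette[signatures[v]] for v in adjacency}
        # Classes can only split, so an unchanged class count means stability.
        if len(set(refined.values())) == len(set(colours.values())):
            return refined
        colours = refined


# A path on four vertices: 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Refinement alone separates end vertices from inner vertices.
stable = refine_colours(path)

# Individualizing vertex 0 (giving it its own colour) and refining again
# yields a discrete colouring, i.e. one leaf of the search tree.
leaf = refine_colours(path, {0: 1, 1: 0, 2: 0, 3: 0})
```

In a full canonization tool, the heuristics named in the abstract (tree traversal, target cell selection, node invariants) decide which vertex to individualize next and which branches of the resulting search tree can be pruned.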
Inferring Chemical Reaction Patterns Using Rule Composition in Graph Grammars
Modeling molecules as undirected graphs and chemical reactions as graph
rewriting operations is a natural and convenient approach to modeling
chemistry. Graph grammar rules are most naturally employed to model elementary
reactions like merging, splitting, and isomerisation of molecules. It is often
convenient, in particular in the analysis of larger systems, to summarize
several subsequent reactions into a single composite chemical reaction. We use
a generic approach for composing graph grammar rules to define chemically
useful rule compositions. We iteratively apply these rule compositions to
elementary transformations in order to automatically infer complex
transformation patterns. This is useful for instance to understand the net
effect of complex catalytic cycles such as the Formose reaction. The
automatically inferred graph grammar rule is a generic representative that also
covers the overall reaction pattern of the Formose cycle, namely two carbonyl
groups that can react with a bound glycolaldehyde to give a second glycolaldehyde.
Rule composition can also be used to study polymerization reactions as well as
more complicated iterative reaction schemes. Terpenes and polyketides, for
instance, form two naturally occurring classes of compounds of utmost
pharmaceutical interest that can be understood as "generalized polymers"
consisting of five-carbon (isoprene) and two-carbon units, respectively.
Generic Strategies for Chemical Space Exploration
Computational approaches to exploring "chemical universes", i.e., very large,
potentially infinite sets of compounds that can be constructed by a
prescribed collection of reaction mechanisms, in practice suffer from a
combinatorial explosion. It quickly becomes impossible to test, for all pairs
of compounds in a rapidly growing network, whether they can react with each
other. More sophisticated and efficient strategies are therefore required to
construct very large chemical reaction networks.
Undirected labeled graphs and graph rewriting are natural models of chemical
compounds and chemical reactions. Borrowing the idea of partial evaluation from
functional programming, we introduce partial applications of rewrite rules.
Binding substrates to rules increases the number of rules but drastically prunes
the substrate sets to which they might match, resulting in dramatically reduced
resource requirements. At the same time, exploration strategies can be guided,
e.g. based on restrictions on the product molecules to avoid the explicit
enumeration of very unlikely compounds. To this end we introduce here a generic
framework for the specification of exploration strategies in graph-rewriting
systems. Using key examples of complex chemical networks from sugar chemistry
and the realm of metabolic networks we demonstrate the feasibility of a
high-level strategy framework.
The ideas presented here can not only be used for a strategy-based chemical
space exploration that corresponds closely to experimental results, but
are also much more general. In particular, the framework can be used to emulate
higher-level transformation models such as illustrated in a small puzzle game.
Maximizing Output and Recognizing Autocatalysis in Chemical Reaction Networks is NP-Complete
Background: A classical problem in metabolic design is to maximize the
production of a desired compound in a given chemical reaction network by
appropriately directing the mass flow through the network. Computationally,
this problem is addressed as a linear optimization problem over the "flux
cone". The prior construction of the flux cone is computationally expensive and
no polynomial-time algorithms are known. Results: Here we show that the output
maximization problem in chemical reaction networks is NP-complete. This
statement remains true even if all reactions are monomolecular or bimolecular
and if only a single molecular species is used as influx. As a corollary we
show, furthermore, that the detection of autocatalytic species, i.e., types
that can only be produced from the influx material when they are present in the
initial reaction mixture, is an NP-complete computational problem. Conclusions:
Hardness results on combinatorial problems and optimization problems are
important to guide the development of computational tools for the analysis of
metabolic networks in particular and chemical reaction networks in general. Our
results indicate that efficient heuristics and approximate algorithms need to
be employed for the analysis of large chemical networks since even conceptually
simple flow problems are provably intractable.
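The notion of directing mass flow through a network can be illustrated by computing the net balance of each species for a given flow. The sketch below assumes a toy three-reaction cycle loosely modelled on formose-type autocatalysis; the species names `F`, `G`, `T`, `Q`, the reaction names, and the helper `net_balance` are all hypothetical and do not come from the paper. It only evaluates one candidate flow and does not solve the maximization problem, which the abstract shows to be NP-complete.

```python
from collections import Counter


def net_balance(reactions, flow):
    """Net production (positive) or consumption (negative) per species.

    reactions: name -> (consumed, produced), each a dict species -> count.
    flow:      name -> number of times the reaction is used.
    Species whose balance is zero are omitted from the result.
    """
    balance = Counter()
    for name, times in flow.items():
        consumed, produced = reactions[name]
        for species, count in consumed.items():
            balance[species] -= count * times
        for species, count in produced.items():
            balance[species] += count * times
    return {s: v for s, v in balance.items() if v != 0}


# Hypothetical toy network: F is the influx species, G the desired output.
toy_reactions = {
    "r1": ({"F": 1, "G": 1}, {"T": 1}),  # F + G -> T
    "r2": ({"T": 1, "F": 1}, {"Q": 1}),  # T + F -> Q
    "r3": ({"Q": 1}, {"G": 2}),          # Q -> 2 G
}

# Using every reaction once consumes two F and yields one extra G.
balance = net_balance(toy_reactions, {"r1": 1, "r2": 1, "r3": 1})
```

In this toy cycle the species `G` is only produced if a copy of `G` is already present, which is the kind of situation the abstract refers to as autocatalytic.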
On the Realisability of Chemical Pathways
The exploration of pathways and alternative pathways that have a specific
function is of interest in numerous chemical contexts. A framework for
specifying and searching for pathways has previously been developed, but a
focus on which of the many pathway solutions are realisable, or can be made
realisable, is missing. Realisable here means that there actually exists some
sequencing of the reactions of the pathway that will execute the pathway. We
present a method for analysing the realisability of pathways based on the
reachability question in Petri nets. For realisable pathways, our method also
provides a certificate encoding an order of the reactions which realises the
pathway. We present two extended notions of realisability of pathways, one of
which is related to the concept of network catalysts. We exemplify our findings
on the pentose phosphate pathway. Lastly, we discuss the relevance of our
concepts for elucidating the choices often implicitly made when depicting
pathways. Comment: Accepted in LNBI proceedings
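The realisability question can be sketched as a search for a firing order in a Petri net: reactions are transitions, species are places, and a pathway prescribes how often each reaction fires. The following Python sketch is a naive exhaustive search under those assumptions, not the method of the paper; the function `find_realisation` and the example reactions are hypothetical.

```python
from collections import Counter


def find_realisation(reactions, flow, initial):
    """Search for an order of reaction firings that executes the pathway.

    reactions: name -> (consumed, produced), each a dict species -> count.
    flow:      name -> number of times the reaction must fire.
    initial:   species -> number of molecules available at the start.
    Returns a list of reaction names (a certificate) or None.
    """
    def search(marking, remaining, order):
        if all(count == 0 for count in remaining.values()):
            return order
        for name, count in remaining.items():
            if count == 0:
                continue
            consumed, produced = reactions[name]
            # A reaction can fire only if all its inputs are present.
            if all(marking[s] >= k for s, k in consumed.items()):
                following = marking.copy()
                following.subtract(consumed)
                following.update(produced)
                rest = dict(remaining)
                rest[name] -= 1
                found = search(following, rest, order + [name])
                if found is not None:
                    return found
        return None

    return search(Counter(initial), dict(flow), [])


# Hypothetical pathway A -> B that passes through a helper species X.
example_reactions = {
    "r1": ({"A": 1, "X": 1}, {"Y": 1}),  # A + X -> Y
    "r2": ({"Y": 1}, {"B": 1, "X": 1}),  # Y -> B + X
}
example_flow = {"r1": 1, "r2": 1}

# Without X no reaction can fire first, so the pathway is not realisable.
blocked = find_realisation(example_reactions, example_flow, {"A": 1})

# Supplying one X, which is returned at the end, makes it realisable.
certificate = find_realisation(example_reactions, example_flow, {"A": 1, "X": 1})
```

The second call illustrates why an extended notion of realisability tied to catalysts is natural: the net effect of the pathway does not mention `X`, yet no firing order exists unless `X` is lent to the network. The search is exponential in the worst case and is meant only for small examples.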
Automated group assignment in large phylogenetic trees using GRUNT: GRouping, Ungrouping, Naming Tool
Background: Accurate taxonomy is best maintained if species are arranged as hierarchical groups in phylogenetic trees. This is especially important as trees grow larger as a consequence of a rapidly expanding sequence database. Hierarchical group names are typically manually assigned in trees, an approach that becomes unfeasible for very large topologies. Results: We have developed an automated iterative procedure for delineating stable (monophyletic) hierarchical groups to large (or small) trees and naming those groups according to a set of sequentially applied rules. In addition, we have created an associated ungrouping tool for removing existing groups that do not meet user-defined criteria (such as monophyly). The procedure is implemented in a program called GRUNT (GRouping, Ungrouping, Naming Tool) and has been applied to the current release of the Greengenes (Hugenholtz) 16S rRNA gene taxonomy comprising more than 130,000 taxa. Conclusion: GRUNT will facilitate researchers requiring comprehensive hierarchical grouping of large tree topologies in, for example, database curation, microarray design and pangenome assignments. The application is available at the greengenes website [1].